Skip to content

planner: add HasTiflash to avoid generating unnecessary mpp task | tidb-test=pr/2660 #65250

Open
hawkingrei wants to merge 69 commits intopingcap:masterfrom
hawkingrei:cannot_call_checkCanPushDownToMPP_twice
Open

planner: add HasTiflash to avoid generating unnecessary mpp task | tidb-test=pr/2660 #65250
hawkingrei wants to merge 69 commits intopingcap:masterfrom
hawkingrei:cannot_call_checkCanPushDownToMPP_twice

Conversation

@hawkingrei
Copy link
Member

@hawkingrei hawkingrei commented Dec 25, 2025

What problem does this PR solve?

Issue Number: close #65335

Problem Summary:

  • checkCanPushDownToMPP was evaluated twice in hash aggregation planning and could run on non-TiFlash paths.
  • After introducing HasTiflash, some physical plan partition metadata still relied on pruned OutputNames, which could break partition pruning expression rewrite in TiFlash partition-table paths (for example Unknown column 'a' in 'expression').
  • tidb_allow_mpp=OFF could still enter MPP-related planning branches in some TiFlash paths.

What changed and how does it work?

  • Introduce and reuse rule.ReconstructTableColNames(ds) for partition pruning column-name reconstruction from ds.TblCols.
  • Add buildPhysPlanPartInfo(ds) and centralize PhysPlanPartInfo production in planner physical-task construction.
    • Applied in find_best_task.go and exhaust_physical_plans.go.
  • Gate MPP-specific paths with explicit MPP capability:
    • hasTiflashForMPP = hasTiflash && IsMPPAllowed().
  • Preserve produced partition metadata when converting MPP task to root task:
    • reuse and clone PhysPlanPartInfo in ConvertToRootTaskImpl.
  • Keep unistore MPP table-scan stop path idempotent (sync.Once) to avoid repeated close/wait in teardown paths.
  • Refresh enforcempp test expectations.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Executed:

  • go test -run TestTiflashPartitionTableScan --tags=intest ./pkg/executor/test/tiflashtest
  • go test -run TestVirtualExprPushDown --tags=intest ./pkg/planner/core
  • go test -run TestMPPSkewedGroupDistinctRewrite --tags=intest ./pkg/planner/core/casetest/enforcempp
  • go test -run TestEnforceMPP --tags=intest ./pkg/planner/core/casetest/enforcempp

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Fix incorrect TiFlash partition pruning metadata propagation after `HasTiflash` planner changes, and avoid entering MPP planning paths when MPP is disabled.

Summary by CodeRabbit

  • New Features

    • Planner now detects and propagates TiFlash availability into plan selection, improving MPP eligibility decisions.
  • Bug Fixes

    • Stricter gating to avoid invalid MPP/TiFlash pushdowns and fewer duplicate/noisy warnings in explain outputs.
    • More consistent plan/explain formatting and identifier alignment for clearer diagnostics.
  • Documentation

    • Added dated guidance on TiFlash propagation and plan-shape regression triage.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/planner SIG: Planner and removed do-not-merge/needs-tests-checked labels Dec 25, 2025
@codecov
Copy link

codecov bot commented Dec 25, 2025

Codecov Report

❌ Patch coverage is 85.91885% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.1973%. Comparing base (cb3f4eb) to head (ce213aa).
⚠️ Report is 4 commits behind head on master.

⚠️ Current head ce213aa differs from pull request most recent head 415ac9b

Please upload reports for the commit 415ac9b to get more accurate results.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #65250        +/-   ##
================================================
+ Coverage   77.6808%   79.1973%   +1.5165%     
================================================
  Files          2008       1952        -56     
  Lines        549769     532302     -17467     
================================================
- Hits         427065     421569      -5496     
+ Misses       121044     108912     -12132     
- Partials       1660       1821       +161     
Flag Coverage Δ
integration 46.4502% <59.1939%> (-1.7340%) ⬇️
tiprow_ft ?
unit 76.0945% <85.9188%> (-0.2468%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 56.7974% <ø> (ø)
parser ∅ <ø> (∅)
br 61.2188% <ø> (+0.3381%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ti-chi-bot ti-chi-bot bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 25, 2025
@hawkingrei hawkingrei force-pushed the cannot_call_checkCanPushDownToMPP_twice branch from 4728a02 to e3b9c98 Compare December 30, 2025 05:02
@hawkingrei
Copy link
Member Author

/retest

@hawkingrei
Copy link
Member Author

/retest

@hawkingrei hawkingrei changed the title planner: cannot call checkCanPushDownToMPP twice planner: cannot call checkCanPushDownToMPP twice | tidb-test=pr/2660 Dec 30, 2025
@hawkingrei
Copy link
Member Author

/retest

1 similar comment
@hawkingrei
Copy link
Member Author

/retest

@hawkingrei hawkingrei changed the title planner: cannot call checkCanPushDownToMPP twice | tidb-test=pr/2660 planner: don't call checkCanPushDownToMPP twice | tidb-test=pr/2660 Dec 30, 2025
@hawkingrei hawkingrei changed the title planner: don't call checkCanPushDownToMPP twice | tidb-test=pr/2660 planner: improve checkCanPushDownToMPP | tidb-test=pr/2660 Dec 30, 2025
@hawkingrei hawkingrei force-pushed the cannot_call_checkCanPushDownToMPP_twice branch from b114bb3 to e949bd4 Compare December 30, 2025 14:20
@hawkingrei
Copy link
Member Author

/retest

@hawkingrei hawkingrei force-pushed the cannot_call_checkCanPushDownToMPP_twice branch from 78144ef to 9b4552a Compare December 30, 2025 15:39
@hawkingrei hawkingrei changed the title planner: improve checkCanPushDownToMPP | tidb-test=pr/2660 planner: improve getHashAggs | tidb-test=pr/2660 Dec 30, 2025
@hawkingrei
Copy link
Member Author

/retest

@hawkingrei hawkingrei changed the title planner: improve getHashAggs | tidb-test=pr/2660 planner: add HasTiflash to avoid generating unnecessary mpp task | tidb-test=pr/2660 Dec 30, 2025
@hawkingrei
Copy link
Member Author

/retest

@hawkingrei hawkingrei force-pushed the cannot_call_checkCanPushDownToMPP_twice branch from ce213aa to 18ce622 Compare March 1, 2026 06:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
pkg/planner/core/stats.go (1)

597-599: ⚠️ Potential issue | 🟠 Major

tiflashPath can still be appended as nil in heuristic keep-path logic.

The new hasTiFlashReplica guard at Line 599 protects path derivation, but the keep branch still appends tiflashPath without re-checking availability. When TiFlash is preferred/enforced but absent, this can inject a nil path.

🔧 Suggested fix
-		keep := (ds.PreferStoreType&h.PreferTiFlash != 0 || isMPPEnforced) && selected.StoreType != kv.TiFlash
+		keep := (ds.PreferStoreType&h.PreferTiFlash != 0 || isMPPEnforced) &&
+			hasTiFlashReplica && tiflashPath != nil && selected.StoreType != kv.TiFlash
 		if keep {
 			// also keep tiflash path as well.
 			ds.PossibleAccessPaths = append(ds.PossibleAccessPaths, tiflashPath)
 			return nil
 		}

Also applies to: 701-705

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/stats.go` around lines 597 - 599, The keep-path logic may
append tiflashPath even when hasTiFlashReplica is false, causing a nil path to
be injected; update the branches that append tiflashPath (the one gated by
(ds.PreferStoreType&h.PreferTiFlash != 0 || isMPPEnforced) and the similar block
around the 701–705 region) to only append tiflashPath when hasTiFlashReplica is
true (or tiflashPath != nil), i.e., add a guard checking hasTiFlashReplica (or
tiflashPath != nil) before pushing/appending tiflashPath in the keep-path branch
so the path is never nil. Ensure the same guard is applied consistently in both
places.
pkg/planner/core/find_best_task.go (1)

90-99: ⚠️ Potential issue | 🟠 Major

Keep ColumnNames aligned with ds.TblCols in the fallback branch.

If reconstruction fails, ds.OutputNames() can have different length/order than ds.TblCols, which can corrupt partition-pruning metadata carried in PhysPlanPartInfo.

Proposed fix
 func buildPhysPlanPartInfo(ds *logicalop.DataSource) *physicalop.PhysPlanPartInfo {
 	columnNames, err := rule.ReconstructTableColNames(ds)
 	if err != nil {
 		// Keep planning resilient. If reconstructing names fails, fall back to output
 		// names to avoid aborting optimization.
-		columnNames = ds.OutputNames()
+		fallback := ds.OutputNames()
+		if len(fallback) == len(ds.TblCols) {
+			columnNames = fallback
+		} else {
+			columnNames = make(types.NameSlice, len(ds.TblCols))
+			for i := range columnNames {
+				columnNames[i] = &types.FieldName{}
+			}
+		}
 	}
 	return &physicalop.PhysPlanPartInfo{
 		PruningConds:   ds.AllConds,
 		PartitionNames: ds.PartitionNames,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/find_best_task.go` around lines 90 - 99, The fallback branch
uses ds.OutputNames() which may not match ds.TblCols length/order and can
corrupt PhysPlanPartInfo.ColumnNames; change the err != nil branch in
find_best_task.go so columnNames is derived from ds.TblCols (e.g., map each
ds.TblCols entry to its name or otherwise construct a slice with the same
length/order as ds.TblCols) instead of using ds.OutputNames(), ensuring
ColumnNames aligns with ds.TblCols when returning the
physicalop.PhysPlanPartInfo.
pkg/planner/core/base/plan_base.go (1)

402-406: ⚠️ Potential issue | 🔴 Critical

Guard nil columns in PossiblePropertiesInfo.Hash64 to prevent panic.

Equals handles nil column slots, but Line 405 dereferences col unconditionally. A nil slot will panic during hashing.

Proposed fix
 		for _, one := range info.Orders {
 			h.HashInt(len(one))
 			for _, col := range one {
-				col.Hash64(h)
+				if col == nil {
+					h.HashByte(base.NilFlag)
+					continue
+				}
+				h.HashByte(base.NotNilFlag)
+				col.Hash64(h)
 			}
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/base/plan_base.go` around lines 402 - 406,
PossiblePropertiesInfo.Hash64 currently dereferences columns in info.Orders
without nil checks, which can panic; update the loop in
PossiblePropertiesInfo.Hash64 (iterating info.Orders and each col) to guard
against nil column slots by checking if col != nil before calling col.Hash64(h)
and otherwise incorporate a stable placeholder into the hash (e.g., call
h.HashInt(0) or another deterministic value) so nil and non-nil columns produce
distinct, consistent hashes.
pkg/planner/core/operator/physicalop/physical_hash_agg.go (1)

57-62: ⚠️ Potential issue | 🟠 Major

MPP-disabled sessions can still slip into MPP hash-agg enumeration.

The early return uses canPushDownToMPP before IsMPPAllowed() is applied. That allows MPP branches to survive in some prop.TaskTp == property.MppTaskType paths.

Proposed fix
-	var canPushDownToMPP bool
-	if util2.ShouldCheckTiFlashPushDown(la.SCtx(), logicalop.GetHasTiFlash(lp)) {
+	sessionVars := la.SCtx().GetSessionVars()
+	canPushDownToMPP := false
+	if sessionVars.IsMPPAllowed() &&
+		util2.ShouldCheckTiFlashPushDown(la.SCtx(), logicalop.GetHasTiFlash(lp)) {
 		canPushDownToMPP = checkCanPushDownToMPP(la)
 	}
 	if prop.TaskTp == property.MppTaskType && !canPushDownToMPP {
 		return nil
 	}
@@
-	canPushDownToMPP = canPushDownToMPP && la.SCtx().GetSessionVars().IsMPPAllowed()
+	// `canPushDownToMPP` already includes `IsMPPAllowed()`.
 	if canPushDownToMPP {
 		taskTypes = append(taskTypes, property.MppTaskType)
 	}

Also applies to: 71-73, 89-91

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/physicalop/physical_hash_agg.go` around lines 57 -
62, The code uses canPushDownToMPP before checking session MPP permission,
letting MPP branches survive; ensure you first check IsMPPAllowed() on the
session context and only then evaluate util2.ShouldCheckTiFlashPushDown/ call
checkCanPushDownToMPP to set canPushDownToMPP. Concretely, wrap the existing
ShouldCheckTiFlashPushDown + checkCanPushDownToMPP logic behind an
IsMPPAllowed() guard (using la.SCtx() and IsMPPAllowed()), or set
canPushDownToMPP = false unless IsMPPAllowed() returns true, then compute it;
apply the same change to the other similar blocks around lines with prop.TaskTp
== property.MppTaskType (the occurrences using canPushDownToMPP).
pkg/planner/core/operator/logicalop/logical_union_all.go (1)

174-184: ⚠️ Potential issue | 🟡 Minor

HasTiflash incorrectly remains true when all children are nil.

The current logic on lines 175-180 starts with hasTiflash = true when children exist, then skips nil children. If all children are nil (e.g., [nil, nil]), hasTiflash remains true, which incorrectly enables MPP-capable propagation.

Suggested fix
 func (p *LogicalUnionAll) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
-	hasTiflash := len(childrenProperties) > 0
+	hasTiflash := false
+	hasValidChild := false
 	for _, child := range childrenProperties {
-		if child == nil {
-			continue
+		if child == nil || !child.HasTiflash {
+			hasTiflash = false
+			break
+		}
+		if !hasValidChild {
+			hasTiflash = true
+			hasValidChild = true
 		}
-		hasTiflash = hasTiflash && child.HasTiflash
 	}
 	p.hasTiflash = hasTiflash
 	return &base.PossiblePropertiesInfo{HasTiflash: p.hasTiflash}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/logicalop/logical_union_all.go` around lines 174 -
184, The method PreparePossibleProperties in LogicalUnionAll sets hasTiflash
true when childrenProperties is non-empty even if all entries are nil; change
the logic so HasTiflash is true only when there is at least one non-nil child
and every non-nil child's HasTiflash is true. In practice, inside
LogicalUnionAll.PreparePossibleProperties introduce a seenNonNil (or similar)
boolean, set it when encountering a non-nil child while computing hasTiflash =
hasTiflash && child.HasTiflash, and after the loop set hasTiflash = false if
seenNonNil is false before assigning p.hasTiflash and returning the
PossiblePropertiesInfo. Ensure you reference the existing symbols:
LogicalUnionAll.PreparePossibleProperties, childrenProperties, p.hasTiflash, and
base.PossiblePropertiesInfo{HasTiflash: ...}.
pkg/planner/core/operator/logicalop/logical_window.go (1)

416-428: ⚠️ Potential issue | 🟠 Major

Guard against nil/empty infos before accessing infos[0].

Accessing infos[0].HasTiflash without checking len(infos) > 0 && infos[0] != nil will panic if this method is ever called with empty or nil child properties. While LogicalWindow typically has one child, defensive coding prevents planner panics.

🛡️ Proposed fix
 func (p *LogicalWindow) PreparePossibleProperties(_ *expression.Schema, infos ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
-	p.hasTiflash = infos[0].HasTiflash
+	hasTiflash := false
+	if len(infos) > 0 && infos[0] != nil {
+		hasTiflash = infos[0].HasTiflash
+	}
+	p.hasTiflash = hasTiflash
 	result := make([]*expression.Column, 0, len(p.PartitionBy)+len(p.OrderBy))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/logicalop/logical_window.go` around lines 416 -
428, In PreparePossibleProperties of LogicalWindow, guard against infos being
empty or nil before reading infos[0]. Replace the unconditional read of
infos[0].HasTiflash with a check like if len(infos) > 0 && infos[0] != nil {
p.hasTiflash = infos[0].HasTiflash } else { p.hasTiflash = false } so
PreparePossibleProperties (and the Orders/HasTiflash returned) can't panic when
called with no child PossiblePropertiesInfo; keep building result from
PartitionBy/OrderBy unchanged and return the same base.PossiblePropertiesInfo
structure.
🧹 Nitpick comments (2)
pkg/planner/cascades/old/optimize.go (1)

374-377: Normalize nil exprInfo to empty info for consistency with core path.

This is optional, but replacing continue with empty-info normalization keeps old-cascades behavior closer to pkg/planner/core/property_cols_prune.go and reduces divergent nil semantics.

♻️ Suggested change
-		exprInfo := expr.ExprNode.PreparePossibleProperties(expr.Schema(), childrenProperties...)
-		if exprInfo == nil {
-			continue
-		}
+		exprInfo := expr.ExprNode.PreparePossibleProperties(expr.Schema(), childrenProperties...)
+		if exprInfo == nil {
+			exprInfo = &base.PossiblePropertiesInfo{}
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/cascades/old/optimize.go` around lines 374 - 377, When
expr.ExprNode.PreparePossibleProperties(expr.Schema(), childrenProperties...)
returns nil, don't skip the node; instead normalize to an empty
possible-properties value consistent with the core implementation: replace the
`continue` with assigning an empty instance of the same possible-properties type
that PreparePossibleProperties returns (i.e., create/assign the zero/empty
possible-properties value) to `exprInfo` so subsequent logic expects a non-nil
info object; update uses of `exprInfo` in the surrounding code in optimize.go
accordingly.
pkg/planner/core/operator/logicalop/logical_projection.go (1)

334-335: Add a defensive guard before indexing childrenProperties[0].

This function currently assumes a non-empty, non-nil child info slice. A small guard would make it more robust against future caller changes.

🛡️ Suggested hardening
 func (p *LogicalProjection) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
+	if len(childrenProperties) == 0 || childrenProperties[0] == nil {
+		p.hasTiflash = false
+		return &base.PossiblePropertiesInfo{}
+	}
 	childProperties := childrenProperties[0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/logicalop/logical_projection.go` around lines 334 -
335, Add a defensive guard at the start of
LogicalProjection.PreparePossibleProperties to avoid indexing into an empty or
nil childrenProperties slice: check if len(childrenProperties) == 0 or
childrenProperties[0] == nil and return nil (or an empty
*base.PossiblePropertiesInfo if the call-sites expect a non-nil pointer). This
touches the PreparePossibleProperties method and the childProperties variable
and ensures you don't dereference a missing child before using
base.PossiblePropertiesInfo.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/operator/logicalop/logical_aggregation.go`:
- Around line 268-285: In LogicalAggregation.PreparePossibleProperties, avoid
direct access to childrenProperties[0] by first checking
len(childrenProperties)>0; if empty, treat childProps as nil and set
la.hasTiflash = false (or preserve current default), then proceed with the
existing branches (the no-group-by branch and the childProps==nil branch).
Update references to childProps in that function accordingly so behavior is
unchanged when a child exists but safe when childrenProperties is empty.

In `@pkg/sessionctx/variable/session.go`:
- Around line 3807-3810: The HasTiflash method on SessionVars references a
non-existent field s.StmtCtx.HasTiflash; update it to use the correct
StatementContext field s.StmtCtx.IsTiFlash so the return becomes based on
(len(s.HypoTiFlashReplicas) > 1 && s.StmtCtx.InExplainStmt) ||
s.StmtCtx.IsTiFlash; modify the HasTiflash function accordingly (symbols:
SessionVars.HasTiflash, s.HypoTiFlashReplicas, s.StmtCtx.InExplainStmt,
s.StmtCtx.IsTiFlash).

---

Duplicate comments:
In `@pkg/planner/core/base/plan_base.go`:
- Around line 402-406: PossiblePropertiesInfo.Hash64 currently dereferences
columns in info.Orders without nil checks, which can panic; update the loop in
PossiblePropertiesInfo.Hash64 (iterating info.Orders and each col) to guard
against nil column slots by checking if col != nil before calling col.Hash64(h)
and otherwise incorporate a stable placeholder into the hash (e.g., call
h.HashInt(0) or another deterministic value) so nil and non-nil columns produce
distinct, consistent hashes.

In `@pkg/planner/core/find_best_task.go`:
- Around line 90-99: The fallback branch uses ds.OutputNames() which may not
match ds.TblCols length/order and can corrupt PhysPlanPartInfo.ColumnNames;
change the err != nil branch in find_best_task.go so columnNames is derived from
ds.TblCols (e.g., map each ds.TblCols entry to its name or otherwise construct a
slice with the same length/order as ds.TblCols) instead of using
ds.OutputNames(), ensuring ColumnNames aligns with ds.TblCols when returning the
physicalop.PhysPlanPartInfo.

In `@pkg/planner/core/operator/logicalop/logical_union_all.go`:
- Around line 174-184: The method PreparePossibleProperties in LogicalUnionAll
sets hasTiflash true when childrenProperties is non-empty even if all entries
are nil; change the logic so HasTiflash is true only when there is at least one
non-nil child and every non-nil child's HasTiflash is true. In practice, inside
LogicalUnionAll.PreparePossibleProperties introduce a seenNonNil (or similar)
boolean, set it when encountering a non-nil child while computing hasTiflash =
hasTiflash && child.HasTiflash, and after the loop set hasTiflash = false if
seenNonNil is false before assigning p.hasTiflash and returning the
PossiblePropertiesInfo. Ensure you reference the existing symbols:
LogicalUnionAll.PreparePossibleProperties, childrenProperties, p.hasTiflash, and
base.PossiblePropertiesInfo{HasTiflash: ...}.

In `@pkg/planner/core/operator/logicalop/logical_window.go`:
- Around line 416-428: In PreparePossibleProperties of LogicalWindow, guard
against infos being empty or nil before reading infos[0]. Replace the
unconditional read of infos[0].HasTiflash with a check like if len(infos) > 0 &&
infos[0] != nil { p.hasTiflash = infos[0].HasTiflash } else { p.hasTiflash =
false } so PreparePossibleProperties (and the Orders/HasTiflash returned) can't
panic when called with no child PossiblePropertiesInfo; keep building result
from PartitionBy/OrderBy unchanged and return the same
base.PossiblePropertiesInfo structure.

In `@pkg/planner/core/operator/physicalop/physical_hash_agg.go`:
- Around line 57-62: The code uses canPushDownToMPP before checking session MPP
permission, letting MPP branches survive; ensure you first check IsMPPAllowed()
on the session context and only then evaluate util2.ShouldCheckTiFlashPushDown/
call checkCanPushDownToMPP to set canPushDownToMPP. Concretely, wrap the
existing ShouldCheckTiFlashPushDown + checkCanPushDownToMPP logic behind an
IsMPPAllowed() guard (using la.SCtx() and IsMPPAllowed()), or set
canPushDownToMPP = false unless IsMPPAllowed() returns true, then compute it;
apply the same change to the other similar blocks around lines with prop.TaskTp
== property.MppTaskType (the occurrences using canPushDownToMPP).

In `@pkg/planner/core/stats.go`:
- Around line 597-599: The keep-path logic may append tiflashPath even when
hasTiFlashReplica is false, causing a nil path to be injected; update the
branches that append tiflashPath (the one gated by
(ds.PreferStoreType&h.PreferTiFlash != 0 || isMPPEnforced) and the similar block
around the 701–705 region) to only append tiflashPath when hasTiFlashReplica is
true (or tiflashPath != nil), i.e., add a guard checking hasTiFlashReplica (or
tiflashPath != nil) before pushing/appending tiflashPath in the keep-path branch
so the path is never nil. Ensure the same guard is applied consistently in both
places.

---

Nitpick comments:
In `@pkg/planner/cascades/old/optimize.go`:
- Around line 374-377: When
expr.ExprNode.PreparePossibleProperties(expr.Schema(), childrenProperties...)
returns nil, don't skip the node; instead normalize to an empty
possible-properties value consistent with the core implementation: replace the
`continue` with assigning an empty instance of the same possible-properties type
that PreparePossibleProperties returns (i.e., create/assign the zero/empty
possible-properties value) to `exprInfo` so subsequent logic expects a non-nil
info object; update uses of `exprInfo` in the surrounding code in optimize.go
accordingly.

In `@pkg/planner/core/operator/logicalop/logical_projection.go`:
- Around line 334-335: Add a defensive guard at the start of
LogicalProjection.PreparePossibleProperties to avoid indexing into an empty or
nil childrenProperties slice: check if len(childrenProperties) == 0 or
childrenProperties[0] == nil and return nil (or an empty
*base.PossiblePropertiesInfo if the call-sites expect a non-nil pointer). This
touches the PreparePossibleProperties method and the childProperties variable
and ensures you don't dereference a missing child before using
base.PossiblePropertiesInfo.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce213aa and 18ce622.

📒 Files selected for processing (70)
  • docs/note/planner/rule/rule_ai_notes.md
  • pkg/executor/test/showtest/show_test.go
  • pkg/executor/testdata/slow_query_suite_out.json
  • pkg/expression/integration_test/BUILD.bazel
  • pkg/planner/cascades/memo/group_expr.go
  • pkg/planner/cascades/old/optimize.go
  • pkg/planner/cascades/old/optimize_test.go
  • pkg/planner/core/base/plan_base.go
  • pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_out.json
  • pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_xut.json
  • pkg/planner/core/casetest/cbotest/testdata/analyze_suite_out.json
  • pkg/planner/core/casetest/cbotest/testdata/analyze_suite_xut.json
  • pkg/planner/core/casetest/enforcempp/enforce_mpp_test.go
  • pkg/planner/core/casetest/enforcempp/testdata/enforce_mpp_suite_out.json
  • pkg/planner/core/casetest/enforcempp/testdata/enforce_mpp_suite_xut.json
  • pkg/planner/core/casetest/hint/testdata/integration_suite_out.json
  • pkg/planner/core/casetest/hint/testdata/integration_suite_xut.json
  • pkg/planner/core/casetest/physicalplantest/testdata/plan_suite_out.json
  • pkg/planner/core/casetest/physicalplantest/testdata/plan_suite_xut.json
  • pkg/planner/core/casetest/rule/testdata/predicate_pushdown_suite_out.json
  • pkg/planner/core/casetest/rule/testdata/predicate_pushdown_suite_xut.json
  • pkg/planner/core/casetest/rule/testdata/predicate_simplification_out.json
  • pkg/planner/core/casetest/rule/testdata/predicate_simplification_xut.json
  • pkg/planner/core/exhaust_physical_plans.go
  • pkg/planner/core/find_best_task.go
  • pkg/planner/core/generator/shallow_ref/shallow_ref_generator.go
  • pkg/planner/core/logical_plan_builder.go
  • pkg/planner/core/operator/logicalop/base_logical_plan.go
  • pkg/planner/core/operator/logicalop/hash64_equals_generated.go
  • pkg/planner/core/operator/logicalop/logical_aggregation.go
  • pkg/planner/core/operator/logicalop/logical_cte.go
  • pkg/planner/core/operator/logicalop/logical_datasource.go
  • pkg/planner/core/operator/logicalop/logical_expand.go
  • pkg/planner/core/operator/logicalop/logical_index_scan.go
  • pkg/planner/core/operator/logicalop/logical_join.go
  • pkg/planner/core/operator/logicalop/logical_plans_misc.go
  • pkg/planner/core/operator/logicalop/logical_projection.go
  • pkg/planner/core/operator/logicalop/logical_selection.go
  • pkg/planner/core/operator/logicalop/logical_sequence.go
  • pkg/planner/core/operator/logicalop/logical_sort.go
  • pkg/planner/core/operator/logicalop/logical_table_scan.go
  • pkg/planner/core/operator/logicalop/logical_tikv_single_gather.go
  • pkg/planner/core/operator/logicalop/logical_top_n.go
  • pkg/planner/core/operator/logicalop/logical_union_all.go
  • pkg/planner/core/operator/logicalop/logical_union_scan.go
  • pkg/planner/core/operator/logicalop/logical_window.go
  • pkg/planner/core/operator/logicalop/logicalop_test/BUILD.bazel
  • pkg/planner/core/operator/logicalop/logicalop_test/hash64_equals_test.go
  • pkg/planner/core/operator/logicalop/logicalop_test/logical_operator_test.go
  • pkg/planner/core/operator/logicalop/shallow_ref_generated.go
  • pkg/planner/core/operator/physicalop/base_physical_agg.go
  • pkg/planner/core/operator/physicalop/physical_hash_agg.go
  • pkg/planner/core/operator/physicalop/physical_limit.go
  • pkg/planner/core/operator/physicalop/physical_projection.go
  • pkg/planner/core/operator/physicalop/physical_selection.go
  • pkg/planner/core/operator/physicalop/physical_stream_agg.go
  • pkg/planner/core/operator/physicalop/physical_window.go
  • pkg/planner/core/operator/physicalop/task_base.go
  • pkg/planner/core/physical_plan_test.go
  • pkg/planner/core/planbuilder.go
  • pkg/planner/core/property_cols_prune.go
  • pkg/planner/core/rule/rule_partition_processor.go
  • pkg/planner/core/stats.go
  • pkg/planner/property/logical_property.go
  • pkg/planner/util/misc.go
  • pkg/sessionctx/variable/session.go
  • pkg/store/mockstore/unistore/cophandler/mpp_exec.go
  • tests/integrationtest/r/executor/prepared.result
  • tests/integrationtest/r/explain_easy.result
  • tests/integrationtest/r/planner/core/casetest/physicalplantest/physical_plan.result
💤 Files with no reviewable changes (3)
  • pkg/planner/core/casetest/physicalplantest/testdata/plan_suite_xut.json
  • pkg/planner/core/casetest/physicalplantest/testdata/plan_suite_out.json
  • pkg/planner/core/operator/logicalop/logical_expand.go
🚧 Files skipped from review as they are similar to previous changes (24)
  • pkg/planner/core/casetest/cbotest/testdata/analyze_suite_xut.json
  • pkg/planner/core/operator/physicalop/physical_stream_agg.go
  • pkg/planner/core/operator/physicalop/physical_window.go
  • pkg/planner/core/operator/logicalop/logicalop_test/BUILD.bazel
  • pkg/planner/core/operator/physicalop/physical_limit.go
  • pkg/planner/property/logical_property.go
  • pkg/planner/cascades/memo/group_expr.go
  • pkg/planner/core/casetest/cbotest/testdata/analyze_suite_out.json
  • pkg/planner/core/operator/logicalop/logical_sequence.go
  • pkg/planner/core/casetest/hint/testdata/integration_suite_xut.json
  • pkg/planner/core/operator/physicalop/task_base.go
  • pkg/planner/core/planbuilder.go
  • pkg/planner/core/operator/logicalop/logical_plans_misc.go
  • pkg/planner/core/casetest/hint/testdata/integration_suite_out.json
  • pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_out.json
  • pkg/planner/core/operator/physicalop/base_physical_agg.go
  • pkg/planner/cascades/old/optimize_test.go
  • pkg/planner/core/casetest/enforcempp/testdata/enforce_mpp_suite_xut.json
  • pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_xut.json
  • pkg/expression/integration_test/BUILD.bazel
  • docs/note/planner/rule/rule_ai_notes.md
  • pkg/planner/core/casetest/enforcempp/enforce_mpp_test.go
  • tests/integrationtest/r/explain_easy.result
  • pkg/planner/core/operator/logicalop/logical_sort.go

Comment on lines +268 to +285
func (la *LogicalAggregation) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
childProps := childrenProperties[0]
la.hasTiflash = childProps != nil && childProps.HasTiflash
// If there's no group-by item, the stream aggregation could have no order property. So we can add an empty property
// when its group-by item is empty.
if len(la.GroupByItems) == 0 {
la.PossibleProperties = [][]*expression.Column{nil}
return nil
la.PossibleProperties = base.PossiblePropertiesInfo{
Orders: [][]*expression.Column{nil},
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
resultProperties := make([][]*expression.Column, 0, len(childProps))
if childProps == nil {
la.PossibleProperties = base.PossiblePropertiesInfo{
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add length check before accessing childrenProperties[0].

Line 269 accesses childrenProperties[0] without first checking len(childrenProperties) > 0. While LogicalAggregation typically has one child, other operators in this PR (e.g., LogicalUnionScan) guard against this case for defensive coding.

🛡️ Proposed fix
 func (la *LogicalAggregation) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
-	childProps := childrenProperties[0]
-	la.hasTiflash = childProps != nil && childProps.HasTiflash
+	var childProps *base.PossiblePropertiesInfo
+	if len(childrenProperties) > 0 {
+		childProps = childrenProperties[0]
+	}
+	la.hasTiflash = childProps != nil && childProps.HasTiflash
 	// If there's no group-by item, the stream aggregation could have no order property. So we can add an empty property
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (la *LogicalAggregation) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
childProps := childrenProperties[0]
la.hasTiflash = childProps != nil && childProps.HasTiflash
// If there's no group-by item, the stream aggregation could have no order property. So we can add an empty property
// when its group-by item is empty.
if len(la.GroupByItems) == 0 {
la.PossibleProperties = [][]*expression.Column{nil}
return nil
la.PossibleProperties = base.PossiblePropertiesInfo{
Orders: [][]*expression.Column{nil},
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
resultProperties := make([][]*expression.Column, 0, len(childProps))
if childProps == nil {
la.PossibleProperties = base.PossiblePropertiesInfo{
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
func (la *LogicalAggregation) PreparePossibleProperties(_ *expression.Schema, childrenProperties ...*base.PossiblePropertiesInfo) *base.PossiblePropertiesInfo {
var childProps *base.PossiblePropertiesInfo
if len(childrenProperties) > 0 {
childProps = childrenProperties[0]
}
la.hasTiflash = childProps != nil && childProps.HasTiflash
// If there's no group-by item, the stream aggregation could have no order property. So we can add an empty property
// when its group-by item is empty.
if len(la.GroupByItems) == 0 {
la.PossibleProperties = base.PossiblePropertiesInfo{
Orders: [][]*expression.Column{nil},
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
if childProps == nil {
la.PossibleProperties = base.PossiblePropertiesInfo{
HasTiflash: la.hasTiflash,
}
return &la.PossibleProperties
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/planner/core/operator/logicalop/logical_aggregation.go` around lines 268
- 285, In LogicalAggregation.PreparePossibleProperties, avoid direct access to
childrenProperties[0] by first checking len(childrenProperties)>0; if empty,
treat childProps as nil and set la.hasTiflash = false (or preserve current
default), then proceed with the existing branches (the no-group-by branch and
the childProps==nil branch). Update references to childProps in that function
accordingly so behavior is unchanged when a child exists but safe when
childrenProperties is empty.

@hawkingrei
Copy link
Member Author

/test fast_test_tiprow

@ti-chi-bot
Copy link

ti-chi-bot bot commented Mar 1, 2026

@hawkingrei: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test build
/test check-dev
/test check-dev2
/test mysql-test
/test pull-br-integration-test
/test pull-build-next-gen
/test pull-integration-e2e-test
/test pull-integration-realcluster-test-next-gen
/test pull-lightning-integration-test
/test pull-mysql-client-test
/test pull-mysql-client-test-next-gen
/test pull-unit-test-ddlv1
/test pull-unit-test-next-gen
/test unit-test

The following commands are available to trigger optional jobs:

/test pingcap/tidb/canary_ghpr_unit_test
/test pull-br-integration-test-next-gen
/test pull-check-deps
/test pull-common-test
/test pull-e2e-test
/test pull-error-log-review
/test pull-integration-common-test
/test pull-integration-copr-test
/test pull-integration-ddl-test
/test pull-integration-ddl-test-next-gen
/test pull-integration-e2e-test-next-gen
/test pull-integration-jdbc-test
/test pull-integration-mysql-test
/test pull-integration-nodejs-test
/test pull-integration-python-orm-test
/test pull-mysql-test-next-gen
/test pull-sqllogic-test
/test pull-tiflash-integration-test

Use /test all to run the following jobs that were automatically triggered:

pingcap/tidb/ghpr_build
pingcap/tidb/ghpr_check
pingcap/tidb/ghpr_check2
pingcap/tidb/ghpr_mysql_test
pingcap/tidb/ghpr_unit_test
pingcap/tidb/pull_build_next_gen
pingcap/tidb/pull_integration_e2e_test
pingcap/tidb/pull_integration_realcluster_test_next_gen
pingcap/tidb/pull_mysql_client_test
pingcap/tidb/pull_mysql_client_test_next_gen
pingcap/tidb/pull_unit_test_next_gen
pull-check-deps
pull-error-log-review
Details

In response to this:

/test fast_test_tiprow

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Mar 2, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: AilinKid
Once this PR has been reviewed and has the lgtm label, please assign benmeadowcroft for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hawkingrei
Copy link
Member Author

/retest

@hawkingrei
Copy link
Member Author

/retest

@hawkingrei
Copy link
Member Author

/retest

1 similar comment
@hawkingrei
Copy link
Member Author

/retest

@tiprow
Copy link

tiprow bot commented Mar 2, 2026

@hawkingrei: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow 415ac9b link true /test fast_test_tiprow

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Mar 2, 2026

@hawkingrei: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-br-integration-test ce213aa link true /test pull-br-integration-test
idc-jenkins-ci-tidb/unit-test 415ac9b link true /test unit-test
pull-unit-test-next-gen 415ac9b link true /test pull-unit-test-next-gen
idc-jenkins-ci-tidb/check_dev_2 415ac9b link true /test check-dev2

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/statistics do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-1-more-lgtm Indicates a PR needs 1 more LGTM. ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wrongly generate the intermediate results for MPP in the optimizor

7 participants